BMC Genomics

15 training papers 2019-06-25 – 2026-03-07

Top medRxiv preprints most likely to be published in this journal, ranked by match strength.

1
bistro: An R package for vector bloodmeal identification by short tandem repeat overlap
2023-09-15 epidemiology 10.1101/2023.09.14.23295566
#1 (4.9%)

Measuring vector-human contact in a natural setting can inform precise targeting of interventions to interrupt transmission of vector-borne diseases. One approach is to directly match human DNA in vector bloodmeals to the individuals who were bitten using genotype panels of discriminative short tandem repeats (STRs). Existing methods for matching STR profiles in bloodmeals to the people bitten preclude the ability to match most incomplete profiles and multi-source bloodmeals to bitten indivi...

2
Phenotyping genetic differences in aldehyde dehydrogenase 2 after an alcohol challenge in humans
2020-06-29 genetic and genomic medicine 10.1101/2020.06.26.20137513
#1 (2.8%)

Inefficient aldehyde metabolism by an aldehyde dehydrogenase 2 (ALDH2) genetic variant, ALDH2*2 (rs671), increases the risk of esophageal cancer with alcohol consumption. Here we tested the hypothesis that additional genetic differences in ALDH2 besides ALDH2*2 exist resulting in inefficient acetaldehyde metabolism after alcohol consumption. Human volunteers were recruited who self-reported flushing after alcohol. The first stage recruited East Asians and the second stage non-East Asians. After...

3
How much should we sequence? An analysis of the Swiss SARS-CoV-2 surveillance effort
2023-08-31 public and global health 10.1101/2023.08.28.23294715
#1 (2.7%)

Background: During the SARS-CoV-2 pandemic, many countries directed substantial resources towards genomic surveillance to detect and track viral variants. There is a debate over how much sequencing effort is necessary in national surveillance programs for SARS-CoV-2 and future pandemic threats. Aim: We aimed to investigate the effect of reduced sequencing on surveillance outcomes in a large genomic dataset from Switzerland, comprising more than 143k sequences. Methods: We employed a uniform downsamp...

4
Frequentmers - a novel way to look at metagenomic Next Generation Sequencing data and an application in detecting liver cirrhosis
2023-09-19 genetic and genomic medicine 10.1101/2023.09.19.23295771
#1 (1.9%)

Early detection of human disease is associated with improved clinical outcomes. However, many diseases are often detected at an advanced, symptomatic stage where patients are past efficacious treatment periods and can result in less favorable outcomes. Therefore, methods that can accurately detect human disease at a presymptomatic stage are urgently needed. Here, we introduce "frequentmers"; short sequences that are specific and recurrently observed in either patient or healthy control samples, ...

5
Variant calling pipelines for whole exome sequencing in clinical context
2024-10-19 genetic and genomic medicine 10.1101/2024.10.18.24315708
#1 (1.9%)

Introduction: Whole exome sequencing (WES) has become a more accessible diagnostic tool in clinical genetic context, leading to the debate of the most accurate and effective bioinformatic pipeline solutions to evaluate variants that explain diseases. Objective: This study aimed to evaluate twenty-four pipelines in two samples comparing accuracy, time and computing efficiency. We also contrasted the results based on regions in two of the most common capture kits. Materials and methods: We used two a...

6
Complete genomic profiles of 1,496 Taiwanese reveal curated medical insights
2021-12-30 genetic and genomic medicine 10.1101/2021.12.23.21268291
#1 (1.9%)

Background: Taiwan Biobank (TWB) project has built a nationwide database to facilitate the basic and clinical collaboration within the island and internationally, which is one of the valuable public datasets of the East Asian population. This study provided comprehensive genomic medicine findings from 1,496 WGS data from TWB. Methods: We reanalyzed 1,496 Illumina-based whole genome sequences (WGS) of Taiwanese participants with at least 30X depth of coverage by Sentieon DNAscope, a precisionFDA cha...

7
Machine Learning Approaches to Predict Alcohol Consumption from Biomarkers in the UK Biobank
2024-12-24 psychiatry and clinical psychology 10.1101/2024.12.22.24319486
#1 (1.9%)

Background: Measuring and estimating alcohol consumption (AC) is important for individual health, public health, and societal benefits. While self-report and diagnostic interviews are commonly used, incorporating biological-based indices can offer a complementary approach. Methods: We evaluate machine learning (ML) based predictions of AC using blood and urine-derived biomarkers. This research has been conducted using the UK Biobank (UKB) Resource. In addition to the prediction of the number of alc...

8
ACCIO: An Assembly-Based Tool Enabling Plasmid Detection
2025-11-02 infectious diseases 10.1101/2025.10.30.25338662
#1 (1.9%)

Plasmids are extrachromosomal mobile genetic elements that often carry genes responsible for antimicrobial resistance. Plasmid epidemiology aims to track the evolution and spread of plasmids, but the field currently faces significant barriers that make practical implementation using whole genome sequence data difficult. Hybrid-assembled genomes remain the most reliable way to identify and track complete plasmids; however, most genomic surveillance data exists in the form of short-read sequenci...

9
Genetics and Epigenetics of Aldehyde Dehydrogenase (ALDH2) in Alcohol Related Liver Disease
2021-04-18 genetic and genomic medicine 10.1101/2021.04.16.21255566
#1 (1.9%)

Alcohol dependence and cirrhosis are key outcomes of excessive alcohol use. We studied the interaction between genetics and epigenetics at the aldehyde dehydrogenase (ALDH2) locus to understand differences in vulnerability to cirrhosis. Individuals were selected according to ICD 10 criteria for Alcohol dependence with Cirrhosis (AUDC+ve, N=116) and Alcohol dependence but without Cirrhosis (AUDC-ve, N=123) from the clinical services of Gastroenterology and Psychiatry at the St Johns Medical Coll...

10
Development and validation of an all-in-one Bat-Clade genomic sequencing and host identification protocol
2024-11-04 public and global health 10.1101/2024.10.31.24316523
#1 (1.9%)

Rabies virus (RABV), a fatal zoonotic pathogen, remains a significant public health concern, with bat-maintained lineages accounting for all currently documented cases in Brazil. Despite the availability of pharmacological prophylaxis for humans and animals, the high genetic diversity of RABV in diverse natural bat hosts and continued circulation in multiple animals pose challenges for effective surveillance. Here, we developed and validated a novel, rapidly deployable amplicon-based sequencing ...

11
Tracing SARS-CoV-2 Clusters Across Local-scales Using Genomic Data
2024-09-22 infectious diseases 10.1101/2024.09.18.24313896
#1 (1.8%)

Quantitatively understanding local transmission dynamics is essential for designing effective prevention strategies. In this study, we developed a novel algorithm to identify introductions and trace locally circulating clusters. We analyzed over 26,000 SARS-CoV-2 genomes and their associated metadata, collected between January and October 2021, to explore introduction and dispersal patterns in Greater Houston, a major metropolitan area known for its demographic diversity. Our analysis identified...

12
Statistical Design and Analysis of PCR Tests for Fast Mutating Viruses
2021-04-09 infectious diseases 10.1101/2021.04.07.21254917
#1 (1.8%)

As the SARS-CoV-2 virus mutates, mutations harboured in patients become increasingly diverse. Patients classified into two strains may have overlapping non-variant-defining mutations. Mutation calling by sequencing is relative to a reference genome. As SARS-CoV-2 mutates, tracking emerging mutant strains may become increasingly problematic if the reference genome remains Wuhan-Hu-1, because the comparison then becomes indirect: current dominant strain relative to Wuhan-Hu-1 versus emerging stra...

13
cocci-call: a species-aware variant identification pipeline for Coccidioides spp. in genomic epidemiology
2025-01-14 epidemiology 10.1101/2025.01.14.25320518
#1 (1.8%)

Emerging fungal pathogens, such as Coccidioides, the causative agent of Valley fever, pose significant clinical and public health challenges. While advances in genomic epidemiology have enhanced our understanding of Coccidioides evolutionary history, the lack of standardized tools for variant identification makes it difficult to draw comparisons between studies. To address this gap, we developed and benchmarked a novel, publicly available pipeline, cocci-call, designed for genome-wide variant id...

14
Genomics-informed outbreak investigations of SARS-CoV-2 using civet
2021-12-14 epidemiology 10.1101/2021.12.13.21267267
#1 (1.8%)

The scale of data produced during the SARS-CoV-2 pandemic has been unprecedented, with more than 5 million sequences shared publicly at the time of writing. This wealth of sequence data provides important context for interpreting local outbreaks. However, placing sequences of interest into national and international context is difficult given the size of the global dataset. Often outbreak investigations and genomic surveillance efforts require running similar analyses again and again on the late...

15
A Tale of Two Waves: Diverse Genomic and Transmission Landscapes Over 15 Months of the COVID-19 Pandemic in Pune, India
2022-11-11 epidemiology 10.1101/2022.11.05.22281203
#1 (1.8%)

The modern response to pandemics, critical for effective public health measures, is shaped by the availability and integration of diverse epidemiological outbreak data. Genomic surveillance has come to the forefront during the coronavirus disease 2019 (COVID-19) pandemic at both local and global scales to identify variants of concern. Tracking variants of concern (VOC) is integral to understanding the evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in space and time. Co...

16
Whole-exome sequencing on 6215 school-aged children reveals the importance of genetic testing in high myopia
2023-06-20 ophthalmology 10.1101/2023.06.15.23291239
#1 (1.7%)

Importance: High myopia (HM) is one of the leading causes of visual impairment and blindness worldwide. It is well-known that genetic factors play a significant role in the development of HM. Early school-aged population-based genetic screening and treatment should be performed to reduce HM complications. Objective: To identify risk variants in a large HM cohort and to examine the implications of universal genetic testing of individuals with HM with respect to clinical decision-making. Design, set...

17
Associations between RetNet gene polymorphisms and efficacy of orthokeratology for myopia control: sample from a clinical retrospective study
2024-09-19 ophthalmology 10.1101/2024.09.18.24313851
#1 (1.6%)

Background: To study how clinical and genetic factors control the effectiveness of orthokeratology lenses in myopia. Methods: In this study, we conducted a retrospective clinical study of 545 children aged 8-12 years with myopia who were wearing orthokeratology lenses for one year and performed whole-genome sequencing (WGS) for 60 participants in two groups, one with rapid axial length progression of larger than 0.33 mm and the other with slow axial length progression of less than 0.09 mm. Genes in...

18
Exploring the Shared Genetic Architectures between Primary Open-Angle Glaucoma and Visual Pathway Regions in the Brain
2025-06-06 ophthalmology 10.1101/2025.06.05.25329020
#1 (1.6%)

Purpose: To investigate the genetic relationships between primary open-angle glaucoma (POAG) and major visual pathways in the brain to better understand the neurological biology of glaucoma, which may facilitate the discovery of neuroprotective drug targets. Methods: We assessed the relationship between POAG and the volumes of five visual pathway regions using genetic correlation and polygenic risk score (PRS). We further used Mendelian randomisation (MR) to investigate the causal relationships. In...

19
Confirmation of the MIR204 n.37C>T heterozygous variant as a cause of chorioretinal dystrophy variably associated with iris coloboma, early-onset cataracts and congenital glaucoma
2023-02-11 ophthalmology 10.1101/2023.02.09.23284763
#1 (1.6%)

Four members of a three-generation family with early-onset chorioretinal dystrophy were shown to be heterozygous carriers of the n.37C>T in MIR204. The identification of this previously reported pathogenic variant confirms the existence of a distinct clinical entity caused by a sequence change in MIR204. The chorioretinal dystrophy was variably associated with iris coloboma, congenital glaucoma, and premature cataracts extending the phenotypic range of the condition. In silico analysis of the n....

20
Common tandem repeat variants associated with glaucoma risk in individuals of African ancestry
2025-02-21 ophthalmology 10.1101/2025.02.19.25322489
#1 (1.6%)

The contribution of common tandem repeats (TR) variants to common, complex disease remains unknown, especially in populations historically underrepresented in genetic research. We identified common TR variants associated with risk of primary open-angle glaucoma (POAG) in individuals of African ancestry. The POAG-associated TR variants were predominantly found at Alu poly(A) tail elements, regions, retinal development enhancers, and harbor binding sites of a POAG-associated transcription factor, ...